Robust Online i-Vectors for Unsupervised Adaptation of DNN Acoustic Models: A Study in the Context of Digital Voice Assistants

نویسندگان

  • Harish Arsikere
  • Sri Garimella
چکیده

Supplementing log filter-bank energies with i-vectors is a popular method for adaptive training of deep neural network acoustic models. While offline i-vectors (the target utterance or other relevant adaptation material is available for i-vector extraction prior to decoding) have been well studied, there is little analysis of online i-vectors and their robustness in multi-user scenarios where speaker changes can be frequent and unpredictable. The authors of [1] showed that online adaptation could be achieved through segmental i-vectors computed using the hidden Markov model (HMM) state alignments of utterances decoded in the recent past. While this approach works well in general, it could be rendered ineffective by speaker changes. In this paper, we study robust extensions of the ideas proposed in [1] by: (a) updating ivectors on a per-frame basis based on the incoming target utterance, and (b) using lattice posteriors instead of one-best HMM state alignments. Experiments with different i-vector implementations show that: (a) when speaker changes occur, lattice-based frame-level i-vectors provide up to 6% word error rate reduction relative to the baseline [1], and (b) online i-vectors are more effective, in general, when the microphone characteristics of test utterances are not seen in training.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Speaker adaptation of DNN-based ASR with i-vectors: does it actually adapt models to speakers?

Deep neural networks (DNN) are currently very successful for acoustic modeling in ASR systems. One of the main challenges with DNNs is unsupervised speaker adaptation from an initial speaker clustering, because DNNs have a very large number of parameters. Recently, a method has been proposed to adapt DNNs to speakers by combining speaker-specific information (in the form of i-vectors computed a...

متن کامل

GMM-derived features for effective unsupervised adaptation of deep neural network acoustic models

In this paper we investigate GMM-derived features recently introduced for adaptation of context-dependent deep neural network HMM (CD-DNN-HMM) acoustic models. We improve the previously proposed adaptation algorithm by applying the concept of speaker adaptive training (SAT) to DNNs built on GMM-derived features and by using fMLLR-adapted features for training an auxiliary GMM model. Traditional...

متن کامل

Robust i-vector based adaptation of DNN acoustic model for speech recognition

In the past, conventional i-vectors based on a Universal Background Model (UBM) have been successfully used as input features to adapt a Deep Neural Network (DNN) Acoustic Model (AM) for Automatic Speech Recognition (ASR). In contrast, this paper introduces Hidden Markov Model (HMM) based ivectors that use HMM state alignment information from an ASR system for estimating i-vectors. Further, we ...

متن کامل

Context Adaptive Neural Network for Rapid Adaptation of Deep CNN Based Acoustic Models

Using auxiliary input features has been seen as one of the most effective ways to adapt deep neural network (DNN)-based acoustic models to speaker or environment. However, this approach has several limitations. It only performs compensation of the bias term of the hidden layer and therefore does not fully exploit the network capabilities. Moreover, it may not be well suited for certain types of...

متن کامل

An on-line acoustic compensation technique for robust speech recognition

In this work we report on the use of an on-line acoustic compensation technique for robust speech recognition. With this technique acoustic mismatch between training and actual conditions is reduced through acoustic mapping. At recognition stage, observation vectors delivered by the acoustic front-end are mapped into a reference acoustic space, while input data are exploited to update the stati...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017